SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models

Liang, Xun, Wang, Hanyu, Lai, Huayi, Niu, Simin, Song, Shichao, Yang, Jiawei, Zhao, Jihao, Xiong, Feiyu, Tang, Bo, Li, Zhiyu

arXiv.org Artificial Intelligence

Large Language Models have achieved remarkable success across various natural language processing tasks, yet their high computational cost during inference remains a major bottleneck. This paper introduces Sparse Expert Activation Pruning (SEAP), a training-free pruning method that selectively retains task-relevant parameters to reduce inference overhead. Inspired by the clustering patterns of hidden states and activations in LLMs, SEAP identifies task-specific expert activation patterns and prunes the model while preserving task performance and enhancing computational efficiency. Experimental results demonstrate that SEAP significantly reduces computational overhead while maintaining competitive accuracy. Notably, at 50% pruning, SEAP surpasses both WandA and FLAP by over 20%, and at 20% pruning, it incurs only a 2.2% performance drop compared to the dense model. These findings highlight SEAP's scalability and effectiveness, making it a promising approach for optimizing large-scale LLMs.
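As a rough illustration of the idea in this abstract, the sketch below ranks the neurons of a single layer by their mean absolute activation on task data and keeps only the top fraction. The scoring rule, the function names (`neuron_scores`, `keep_mask`, `prune_rows`), and the toy values are assumptions made for illustration; this is not the procedure from the SEAP paper.

```python
def neuron_scores(activations):
    """Mean absolute activation per neuron over a list of samples (hypothetical scoring rule)."""
    n_neurons = len(activations[0])
    n_samples = len(activations)
    return [sum(abs(row[j]) for row in activations) / n_samples for j in range(n_neurons)]


def keep_mask(scores, prune_ratio):
    """Boolean mask that keeps the highest-scoring neurons and drops `prune_ratio` of them."""
    n_keep = len(scores) - int(len(scores) * prune_ratio)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept = set(ranked[:n_keep])
    return [j in kept for j in range(len(scores))]


def prune_rows(weight, mask):
    """Drop the weight rows (one row per neuron) whose mask entry is False."""
    return [row for row, keep in zip(weight, mask) if keep]


# Toy activations collected on one task: 2 samples x 4 neurons (made-up numbers).
task_acts = [[0.1, 2.0, -0.05, 1.0],
             [0.2, -1.5, 0.0, 0.8]]
weight = [[1, 2], [3, 4], [5, 6], [7, 8]]  # one row per neuron

mask = keep_mask(neuron_scores(task_acts), prune_ratio=0.5)
pruned = prune_rows(weight, mask)
print(mask)    # neurons 1 and 3 are kept
print(pruned)
```

Because the mask is computed from activations on task data alone, no gradient updates are involved, which is the sense in which such a scheme is training-free.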


Measuring and Reducing LLM Hallucination without Gold-Standard Answers via Expertise-Weighting

Wei, Jiaheng, Yao, Yuanshun, Ton, Jean-Francois, Guo, Hongyi, Estornell, Andrew, Liu, Yang

arXiv.org Artificial Intelligence

LLMs are known to provide factually inaccurate information that appears confident, i.e., hallucination. This is currently a major obstacle to the reliability and trustworthiness of LLMs [13, 34, 21]. An essential step towards solving this problem is measuring hallucinations. However, this is challenging from a data perspective, as existing metrics presume that benchmark datasets possess gold-standard answers, i.e., "best" or "correct" answers written by humans [16]. The requirement of such answers imposes two fundamental limitations on hallucination measurement: 1) hiring human annotators to produce gold-standard answers is costly in both time and money [4, 43, 38]; 2) gold-standard answers are prone to natural human errors [7, 6, 49]. To this end, we take a step forward and propose a framework that measures LLM hallucinations without requiring gold-standard answers. Our framework is partially inspired by the literature on learning with noisy labels [23, 18, 19], where there are no ground-truth labels for verifying the quality of imperfect human annotations [43, 38, 20], detecting annotation errors [48, 26, 47], or training models robustly [42, 3, 17, 36, 39]. Our basic idea is simple: leverage off-the-shelf, high-quality LLMs to generate answers that serve as a proxy for gold-standard answers. The primary challenge in such an approach is how to properly weigh the expertise of each LLM for a given question x, without a priori knowledge of the true (i.e.



Applied AI News

Blanchard, David

AI Magazine

Blue Cross/Blue Shield of Virginia (Richmond, VA) has developed an expert system to classify, evaluate and process medical claims. The system, called MedScreen, reportedly can process up to 500 claims in 45 minutes, an operation that used to take several days to complete.

AT&T's Merrimack Valley Works (North Andover, MA) has developed the Expert Capacity and Material System (XCAM), an expert system which simplifies forecast evaluations for a manufacturing operation. The system automates the analysis of [...]

The US Army Laboratory Command's Human Engineering Laboratory (Aberdeen Proving Ground, MD) has awarded a $2.4 million contract to Carnegie Group (Pittsburgh, PA) to continue work on a knowledge-based logistics planning system. The system [...]

IBM (Armonk, NY) and Dragon Systems (Newton, MA) have jointly developed VoiceType, a speech recognition system based on elements of [...] allows hands-free typing.

[...] The NRM has been successfully deployed in a number of Australian banks, as well as a food storage and distribution center.

ICL (Birmingham, England) has completed a pilot test of an intelligent system for field service diagnosing [...] ICL used a laptop-based [...]